CLAIMS

1. A method of using a computer system to identify a microbe inhabiting a host organism,
comprising the steps of:

a) obtaining sequence information from a plurality of sequences from at least
one host organism; and

b) searching a database of host organism genomic sequences to determine the
presence or absence of said plurality of sequences in said database,
wherein the absence of at least one of said sequences in said database
indicates that said at least one sequence is a candidate sequence belonging
to a microbe.

2. A method of using a computer system to identify a microbe inhabiting a host organism,
comprising the steps of:

a) obtaining sequence information from a library of genomic DNA from a
host organism suspected of harboring a microbe; and

b) searching a database of host organism genomic sequences from host
organisms which do not harbor the microbe to determine the presence or
absence of a sequence in said library in said database,
wherein the absence of said sequence indicates that said sequence is a
candidate microbe sequence.

3. A method of using a computer system to identify a microbe inhabiting a host organism,
comprising the steps of: 

a) obtaining sequence information from a plurality of expressed sequences 
from at least one host organism; and 

b) searching a database of host organism genomic sequences to determine the 
presence or absence of said plurality of expressed sequences in said 
database, wherein the absence of at least one of said expressed sequences 
in said database indicates that said at least one sequence is a candidate 
sequence belonging to a microbe. 
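The search-and-subtract logic common to claims 1 to 3 can be sketched as follows. This is a minimal illustration only, not the claimed implementation: it assumes a small in-memory host database, substitutes exact k-mer (word) matching for an alignment search, and treats a query as "present" when at least half of its words occur in the host database. The function names, the word size of 8, and the `min_shared` threshold are all hypothetical.

```python
from typing import Dict, Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k words in seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_host_index(host_sequences: Iterable[str], k: int) -> Set[str]:
    """Collect every length-k word found in the host genomic sequences."""
    index: Set[str] = set()
    for s in host_sequences:
        index |= kmers(s, k)
    return index


def candidate_microbe_sequences(
    queries: Dict[str, str],
    host_index: Set[str],
    k: int,
    min_shared: float = 0.5,  # assumed cutoff for calling a query "present"
) -> List[str]:
    """Return names of queries judged absent from the host database."""
    candidates = []
    for name, seq in queries.items():
        words = kmers(seq, k)
        if not words:
            continue  # sequence shorter than k: too short to judge
        shared = len(words & host_index) / len(words)
        if shared < min_shared:
            candidates.append(name)
    return candidates


# Toy data, invented for illustration only.
host = ["ACGTACGTACGTTTGACCA", "GGGCCCAAATTTGGGCCC"]
queries = {"est_host": "ACGTACGTTTGAC", "est_unknown": "TTAGGCTAGCTAGGATC"}
index = build_host_index(host, k=8)
result = candidate_microbe_sequences(queries, index, k=8)
print(result)
```

Here `est_host` is an exact substring of the first host sequence and is subtracted, while `est_unknown` shares no words with the host and is reported as a candidate.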

4. The method according to claim 1, 2, or 3, wherein said microbe is a symbiotic organism.

5. The method according to claim 4, wherein said microbe is a mutualistic organism, a
commensal organism, or a parasitic organism.

6. The method according to claim 1, 2, or 3, wherein said microbe is a pathogenic organism.

7. The method according to claim 1, 2, or 3, wherein said plurality of sequences are compared to
said database of host genomic sequences simultaneously.

8. A method of using a computer system to identify an intracellular pathogen, comprising
the steps of:

a) obtaining sequence information from at least one host organism having a
pathogenic condition;

b) identifying sequences from said at least one host organism which are not
found in a plurality of host organisms not having said pathogenic
condition;

c) comparing said sequences identified in step (b) with a plurality of
sequences in a database of host genomic sequences; and

d) eliminating identified sequences which match said host genomic
sequences, wherein any remaining sequences are identified as candidate
pathogen sequences.

9. The method according to claim 8, wherein said identified sequences are compared
simultaneously with sequences in said database of host genomic sequences.
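Steps (a) through (d) of claim 8 amount to two successive subtractions, which the following sketch illustrates. It is a simplified illustration under stated assumptions: sequences are plain strings, "found in" a healthy host means exact identity, and a "match" to the host genome means exact substring containment. A real search would use an alignment tool instead. The function name and toy data are hypothetical.

```python
from typing import Iterable, List, Set


def candidate_pathogen_sequences(
    diseased: Iterable[str],
    healthy_hosts: Iterable[Iterable[str]],
    host_genome: Iterable[str],
) -> List[str]:
    """Two-stage subtraction; exact matching stands in for alignment."""
    # Pool the sequences seen in any host lacking the pathogenic condition.
    seen_in_healthy: Set[str] = set()
    for library in healthy_hosts:
        seen_in_healthy.update(s.upper() for s in library)

    # Step (b): keep sequences not found in the healthy hosts.
    disease_specific = [
        s.upper() for s in diseased if s.upper() not in seen_in_healthy
    ]

    # Steps (c) and (d): drop sequences that match host genomic sequences.
    genome = [g.upper() for g in host_genome]
    return [s for s in disease_specific if not any(s in g for g in genome)]


# Toy data, invented for illustration only.
diseased = ["ACGTAC", "TTGACC", "GATTACA"]
healthy_hosts = [["ACGTAC"], ["CCCGGG"]]
host_genome = ["AAATTGACCAAA"]
remaining = candidate_pathogen_sequences(diseased, healthy_hosts, host_genome)
print(remaining)
```

In this toy run, `ACGTAC` is removed in step (b) because a healthy host also carries it, `TTGACC` is removed in step (d) because it occurs in the host genome, and `GATTACA` remains as the candidate.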

10. The method according to claim 1 or 8, wherein said sequences are expressed sequences. 

11. The method according to claim 1, 3, or 8, wherein said expressed sequences are EST
sequences.

12. The method according to claim 1, 3, or 8, wherein said expressed sequences are cDNA
sequences.

13. The method according to claim 1, 2, 3, or 8, wherein said host organism is an animal.

14. The method according to claim 13, wherein said animal is a mammal.

15. The method according to claim 14, wherein said mammal is a human.

16. The method according to claim 13, wherein said animal is an insect, bird, or a fish.

17. The method according to claim 1, 2, 3, or 8, wherein said host organism is a
microorganism, a fungus, or a plant.

18. The method according to claim 11, wherein said candidate sequence is identified by
comparing sequences in a database of expressed sequences with said sequences in said
genomic database.



19. The method according to claim 8, wherein said expressed sequence is identified using a
differential gene expression assay.

20. The method according to claim 19, wherein said differential gene expression assay is
selected from the group consisting of SAGE, cDNA representational difference analysis,
and suppression subtraction analysis.

21. The method according to claim 8, wherein said at least one sequence is identified using a
subtractive hybridization method.

22. The method according to claim 21, wherein said subtractive hybridization method is
representational difference analysis.

23. The method according to claim 1, 2, 3, or 8, wherein said candidate sequence is used as a
query sequence to search a database of microbial sequences.

24. The method according to claim 23, wherein said microbial sequences include viral
sequences.

25. The method according to claim 1, 2, 3, or 8, wherein any of: vector sequences, repetitive
sequences, mitochondrial sequences, non-host species sequences, known host organism
sequences, and combinations thereof are eliminated from the genomic database
comprising sequences from the host organism.
26. The method according to claim 1, 2, 3, or 8, wherein said searching is performed
iteratively using progressively smaller word sizes.
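The iterative search of claim 26 can be pictured as repeated subtraction passes, each with a smaller word size, so that queries with long exact matches to the host are removed early and weaker matches are caught in later passes. The sketch below is one possible reading, not the claimed implementation: it assumes exact word matching, and the function names and word sizes (12, then 6) are hypothetical values chosen for toy data.

```python
from typing import Dict, Sequence, Set


def words(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k words in seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def iterative_subtraction(
    queries: Dict[str, str],
    host_sequences: Sequence[str],
    word_sizes: Sequence[int] = (12, 6),  # assumed sizes, largest first
) -> Dict[str, str]:
    """Remove queries sharing a word with the host, pass by pass."""
    remaining = dict(queries)
    for k in sorted(word_sizes, reverse=True):
        host_words: Set[str] = set()
        for h in host_sequences:
            host_words |= words(h, k)
        # Keep only queries with no length-k word in common with the host.
        # Queries shorter than k have no words and pass through this round.
        remaining = {
            name: seq
            for name, seq in remaining.items()
            if not (words(seq, k) & host_words)
        }
    return remaining


# Toy data, invented for illustration only.
host_sequences = ["ACGTTGCAAGGCTTAACCGGTTAACC"]
queries = {
    "q_long": "TTGCAAGGCTTAACCG",   # long exact match to the host
    "q_short": "GGGGGCAAGGCGGGGG",  # shares only a 6-letter word
    "q_novel": "ATATATATATATATAT",  # shares nothing with the host
}
one_pass = iterative_subtraction(queries, host_sequences, word_sizes=(12,))
two_pass = iterative_subtraction(queries, host_sequences, word_sizes=(12, 6))
print(sorted(one_pass), sorted(two_pass))
```

With only the larger word size, `q_short` survives; adding the smaller word size removes it and leaves `q_novel` as the sole candidate.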

27. The method according to claim 1, 2, 3, or 8, wherein said candidate sequence is used to
probe a library of sequences including sequences from at least one microbe.

28. The method according to claim 27, wherein a sequence identified by said probe is used to
express a peptide.

29. The method according to claim 6 or 8, wherein said pathogen is an infectious disease
organism.

30. The method according to claim 6 or 8, wherein said pathogen is associated with a
pathogenic condition selected from the group consisting of an inflammatory disease, an
autoimmune disease, and a cell proliferative disease.

31. The method according to claim 30, wherein said disease is selected from the group
consisting of sarcoidosis, inflammatory bowel disease, atherosclerosis, multiple sclerosis,
rheumatoid arthritis, type 1 diabetes mellitus, lupus erythematosus, Hodgkin's disease,
and bronchioalveolar carcinoma.



32. The method according to claim 1, 2, 3, or 8, wherein said candidate sequence is used to
produce a peptide.

33. The method according to claim 1, 2, 3, or 8, wherein said candidate sequence is operably
linked to a promoter sequence in an expression vector.

34. The method according to claim 32, wherein said peptide is administered to the host
organism in an amount effective to generate a protective immune response.

35. The method according to claim 33, wherein said expression vector is administered to the
host organism in an amount effective to generate a protective immune response.

36. The method according to claim 1, 3, or 8, wherein the complementary sequence of a
coding sequence of said candidate sequence is administered to the host organism in an
amount sufficient to prevent the expression of a polypeptide encoded by said candidate
sequence in said host organism.

37. The method according to claim 36, wherein said complementary sequence further
comprises a cleaving moiety for cleaving RNA.

38. The method according to claim 1, 2, 3, or 8, wherein said candidate sequence is
hybridized to nucleic acids from said host organism, and wherein the presence or absence
of hybridization provides an indication of the presence or absence of said intracellular
organism in a host cell from said host organism.

39. A system, comprising:

a) a first database comprising sequences from at least one host organism;

b) a second database comprising genomic sequences from said host
organism; and

c) an information management system comprising a search and subtraction
function for eliminating sequences in said database comprising genomic
sequences which are not found in said first database.

40. The system according to claim 39, further comprising at least one user device
connectable to the network.



41. The system according to claim 39, wherein said system comprises a program capable of
implementing an algorithm for comparing a plurality of sequences in the first database
with all of the sequences in the second database.

42. The system according to claim 41, wherein said system comprises a MEGABLAST
program.

43. The system according to claim 39, wherein said system comprises a high speed, linear
array processor.

44. The system according to claim 39, wherein said system further comprises a result
sequence set comprising sequences in the first database which do not match sequences in
the genomic database.
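The arrangement of a first database, a second (genomic) database, and a result sequence set, as recited in claims 39 and 44, might be modeled along the following lines. This is a hedged sketch only: the class name, field names, and the default substring-based matching function are hypothetical, and a practical system would delegate matching to an alignment program such as the MEGABLAST program named in claim 42.

```python
from dataclasses import dataclass
from typing import Callable, Dict


def substring_match(query: str, genomic: str) -> bool:
    """Assumed stand-in for an alignment search: exact containment."""
    return query.upper() in genomic.upper()


@dataclass
class SubtractionSystem:
    first_db: Dict[str, str]    # sequences from at least one host organism
    genomic_db: Dict[str, str]  # genomic sequences from the host organism
    matches: Callable[[str, str], bool] = substring_match

    def result_set(self) -> Dict[str, str]:
        """Sequences in the first database with no match in the genomic one."""
        return {
            name: seq
            for name, seq in self.first_db.items()
            if not any(self.matches(seq, g) for g in self.genomic_db.values())
        }


# Toy data, invented for illustration only.
system = SubtractionSystem(
    first_db={"est1": "ACGT", "est2": "TTTT"},
    genomic_db={"chr_toy": "GGACGTGG"},
)
unmatched = system.result_set()
print(unmatched)
```

Passing a different `matches` callable lets the same structure use a stricter or looser comparison without changing how the result set is assembled.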
45. The system according to claim 39, further comprising an identity matrix which requires a
score of greater than or equal to 60.

46. The system according to claim 39 or 45, wherein the system iteratively computes the
degree of alignment between sequences in the first and second database.



47. The system according to claim 46, wherein iterative computing is performed using
progressively smaller word sizes.

48. The system according to claim 39, wherein said system provides one or more programs for
performing one or more electronic subtraction functions for eliminating any of: vector
sequences, repetitive sequences, mitochondrial sequences, sequences from non-host
organisms, and combinations thereof from the genomic database.

49. A computer program product comprising a computer readable memory on which is
embedded one or more programs for implementing any of the system functions recited in
claim 39 or 41.

50. A method of using a computer system to identify a microbe inhabiting a host organism,
comprising the steps of:

obtaining sequence information from a plurality of expressed sequences from at
least one host organism; and

searching a database of host organism genomic sequences to determine the
presence or absence of the plurality of expressed sequences in the database,
wherein the absence of an expressed sequence in the database identifies the
expressed sequence as a candidate microbe sequence.

51. The method according to claim 50, wherein said plurality of sequences are from a library
of sequences.



52. The method according to claim 51, wherein said library of sequences is a library of
expressed sequences.

53. The method according to claim 51 or 52, wherein said library comprises human
sequences.

54. The method according to claim 53, wherein said library comprises human sequences
from one or more humans having a pathological condition.

55. The method according to claim 54, wherein said pathological condition is a disease
selected from the group consisting of an inflammatory disease, an autoimmune disease,
and a cell proliferative disease.

56. The method according to claim 55, wherein said disease is selected from the group
consisting of sarcoidosis, inflammatory bowel disease, atherosclerosis, multiple sclerosis,
rheumatoid arthritis, type I diabetes mellitus, lupus erythematosus, Hodgkin's disease,
and bronchioalveolar carcinoma.

57. The method according to claim 50, wherein said step of obtaining sequence information
comprises sequencing expressed sequences cloned in a library of expressed sequences.



58. A method of using a computer system to identify a microbe inhabiting a host organism,
comprising the steps of:

obtaining expressed sequence information from a plurality of sequences from at
least one non-microbial host organism; and

searching a database of microbial sequences to determine the presence or absence
of the plurality of expressed sequences in the database, wherein the presence of an
expressed sequence in the database identifies the expressed sequence as a
candidate microbe sequence.
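Unlike the subtraction approach of the earlier independent claims, this method flags a sequence when it is present in a microbial database. A minimal sketch of that positive search follows, assuming exact shared words of length 8 as the criterion for "presence". The function names, the word size, and the toy database entries are hypothetical, and a real search would rely on an alignment tool.

```python
from typing import Dict, List, Set


def words(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k words in seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def microbial_hits(
    expressed: Dict[str, str],
    microbial_db: Dict[str, str],
    k: int = 8,  # assumed word size
) -> Dict[str, List[str]]:
    """Map each expressed sequence to the microbial entries it shares a word with."""
    hits: Dict[str, List[str]] = {}
    for qname, qseq in expressed.items():
        qwords = words(qseq, k)
        matched = [
            mname
            for mname, mseq in microbial_db.items()
            if qwords & words(mseq, k)
        ]
        if matched:
            hits[qname] = matched  # presence marks a candidate microbe sequence
    return hits


# Toy data, invented for illustration only.
microbial_db = {
    "virus_x": "TTAGGCTAGCTAGGATCCA",
    "bact_y": "CCCCGGGGAAAATTTT",
}
expressed = {"est_a": "GGCTAGCTAGGA", "est_b": "ACGTACGTACGT"}
hits = microbial_hits(expressed, microbial_db)
print(hits)
```

Here `est_a` is reported as a candidate because it shares words with the toy entry `virus_x`, while `est_b` matches nothing and is not reported.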

59. The method according to claim 58, wherein said plurality of sequences are from a library
of expressed sequences.

60. The method according to claim 58, wherein said library of sequences comprises
sequences from one or more humans having a pathological condition.

61. The method according to claim 60, wherein said pathological condition is an infectious
disease.